Audio/visual synching system and method

ABSTRACT

Access to a networked communication session is provided to each of a plurality of user computing devices that are each configured with at least a camera, audio input subsystem, and audio output subsystem. During the networked communication session and after an audio input subsystem of a first user computing device detects a user speaking, the volume of the audio output subsystem of the first user computing device is adjusted. Further, the volume of the audio input subsystem of each of the other user computing devices is adjusted. Furthermore, after an audio input subsystem of a second user computing device detects a user of the second user computing device speaking, the volume of the audio output subsystem of the second user computing device is adjusted, as is the audio input subsystem of the other respective user computing devices and the audio output subsystem of the first user computing device.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 16/192,691, filed Nov. 15, 2018, which is based on and claims priority to U.S. Patent Application Ser. No. 62/586,850, filed Nov. 15, 2017. Further, U.S. patent application Ser. No. 16/192,691 is a continuation-in-part of U.S. patent application Ser. No. 16/002,668, filed Jun. 7, 2018. Moreover, U.S. patent application Ser. No. 16/192,691 is a continuation-in-part of U.S. patent application Ser. No. 16/181,617, filed Nov. 6, 2018, which is a continuation of U.S. patent application Ser. No. 15/853,377, filed Dec. 22, 2017, now U.S. Pat. No. 10,121,512, issued Nov. 6, 2018, which is a continuation of U.S. patent application Ser. No. 15/608,932, filed May 30, 2017, now U.S. Pat. No. 9,852,764, issued Dec. 26, 2017, which is a continuation-in-part of U.S. patent application Ser. No. 14/813,974, filed Jul. 30, 2015, now U.S. Pat. No. 9,666,231, issued May 30, 2017. U.S. patent application Ser. No. 14/813,974 is based on and claims priority to U.S. Provisional Patent Application Ser. No. 62/031,114, filed on Jul. 30, 2014, and U.S. patent application Ser. No. 14/813,974 is a continuation-in-part of U.S. patent application Ser. No. 14/316,536, filed Jun. 26, 2014, now U.S. Pat. No. 9,363,448, issued Jun. 7, 2016. Further, U.S. patent application Ser. No. 15/608,932 is a continuation-in-part of U.S. patent application Ser. No. 15/247,534, filed Aug. 25, 2016, now U.S. Pat. No. 9,787,945, issued Oct. 10, 2017, which is based on and claims priority to U.S. Provisional Patent Application Ser. No. 62/209,727, filed Aug. 25, 2015, U.S. Provisional Patent Application Ser. No. 62/242,029, filed Oct. 15, 2015, and U.S. Provisional Patent Application Ser. No. 62/329,081, filed Apr. 28, 2016. U.S. application Ser. No. 15/247,534, further, is a continuation-in-part of U.S. application Ser. No. 14/833,984, filed Aug. 24, 2015, now U.S. Pat. No. 9,661,256, issued May 23, 2017, which is a continuation-in-part of U.S. application Ser. No. 14/316,536, filed Jun. 26, 2014, now U.S. Pat. No. 9,363,448, issued Jun. 7, 2016, which is based on and claims priority to U.S. Provisional Patent Application Ser. No. 61/839,757, filed Jun. 26, 2013 and U.S. Provisional Patent Application Ser. No. 61/845,743, filed Jul. 12, 2013, the entire contents of all of which are incorporated by reference as if expressly set forth in their respective entireties herein.

FIELD

The present application relates, generally, to content presentation and, more particularly, to a system and method for providing and interacting with content via one or more interactive communication sessions.

BACKGROUND

Interactive and supplemental content that has been made available to viewers has typically been provided through a decoupled, separate communication channel. For instance, a producer can provide a separate communication channel with data, a video stream, or both at a URL associated with the broadcast. For example, a television station can have on-air programming and also provide supplemental content available through a website. Apart from sponsoring both sources of information, these communication channels are generally decoupled from one another. In other words, the broadcaster has only an indirect relationship to the viewer with regard to any supplemental content.

In interactive videoconferencing sessions, a plurality of participants may interact within a single location, such as a conference room. In such instances, some people may have the speakers of their respective computing devices on while simultaneously speaking into their devices' microphones. This can result in audio disturbances, such as those caused by a microphone capturing the audio output of a nearby speaker. Feedback, echo or other interference can impede the audio content of the interactive videoconference and negatively affect the experience for all participants. This further inhibits the ability of users to collaborate.

BRIEF SUMMARY

In accordance with one or more implementations of the present application, a system and/or method provide respective interactive audio/video content of each of a plurality of computing devices during a networked communication session. At least one processor operatively coupled to non-transitory processor readable media is configured thereby to cause the at least one processor to perform steps, including to define the networked communication session. Further, respective access to the networked communication session is provided to each of a plurality of user computing devices that are each configured with at least a camera, audio input subsystem, and audio output subsystem. A composited interactive audio/video feed is generated that is comprised of video and audio input received during the networked communication session from at least two of the respective user computing devices. The composited audio/video feed is transmitted to each of the respective user computing devices.

Continuing with these one or more implementations, during the networked communication session and after an audio input subsystem of a first user computing device of the plurality of user computing devices detects a user of the first user computing device speaking, the at least one processor executes instructions to adjust volume of the audio output subsystem of the first user computing device. Further, the volume of the audio input subsystem of each of the other respective ones of the plurality of user computing devices is adjusted. Furthermore, during the networked communication session and after an audio input subsystem of a second user computing device of the plurality of user computing devices detects a user of the second user computing device speaking, the volume of the audio output subsystem of the second user computing device is adjusted, as is the audio input subsystem of each of the other respective ones of the plurality of user computing devices and the audio output subsystem of the first user computing device.
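
The volume-adjustment steps above can be summarized in a short sketch. The following TypeScript is purely illustrative; the Device interface and the setOutputVolume and setInputVolume names are assumptions made for this example and do not correspond to any particular disclosed implementation.

    // Illustrative sketch of the active-speaker volume logic described above.
    interface Device {
      id: string;
      setOutputVolume(level: number): void; // speaker (audio output subsystem)
      setInputVolume(level: number): void;  // microphone (audio input subsystem)
    }

    class SessionController {
      private activeId: string | null = null;

      constructor(private devices: Device[]) {}

      // Called when a device's audio input subsystem detects its user speaking.
      onSpeechDetected(speakerId: string): void {
        if (this.activeId === speakerId) return; // already the active device
        for (const device of this.devices) {
          if (device.id === speakerId) {
            device.setOutputVolume(0.0); // turn the new active device's speaker down
            device.setInputVolume(1.0);  // leave its microphone open
          } else {
            device.setInputVolume(0.0);  // mute the other devices' microphones
            device.setOutputVolume(1.0); // restore their speakers for listening
          }
        }
        this.activeId = speakerId;
      }
    }

Note that the same call handles both steps a) and b): when a second device detects speech, its speaker is lowered, the other microphones are muted, and the first device's speaker is restored by the else branch.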

It is with respect to these and other considerations that the disclosure made herein is presented.

BRIEF DESCRIPTION OF THE DRAWINGS

Aspects of the present disclosure will be more readily appreciated upon review of the detailed description of its various embodiments, described below, when taken in conjunction with the accompanying drawings, of which:

FIG. 1 is a diagram illustrating an example hardware arrangement that operates for providing the systems and methods disclosed herein;

FIG. 2 is a block diagram that illustrates functional elements of a computing device in accordance with an embodiment;

FIGS. 2A and 2B represent example computing devices configured with a camera and microphone, and associated operational aspects thereof;

FIG. 3 is an example high-level diagram that illustrates interactivity between various ones of the devices illustrated in FIG. 1;

FIGS. 4A and 4B illustrate operational aspects of computing devices configured in accordance with an example implementation of the present application; and

FIGS. 5A-5D illustrate example generation, configuration and utilization of audio components in connection with an interactive video conferencing session, in accordance with the present application.

DETAILED DESCRIPTION

By way of introduction and overview, in one or more implementations the present application provides systems and methods for providing and improving interactive video conferencing over one or more data communication networks, such as the Internet. Devices operating, for example, iOS, ANDROID, WINDOWS MOBILE, BLACKBERRY, MAC OS, WINDOWS or other operating systems are configured with one or more software applications that provide functionality, such as with an interface for developing (“authoring”) distributable coordinated presentations. Moreover, such devices can be configured with one or more software applications that provide interactive video conferencing functionality. Using a respective interface, which can be a simple browser-based application, users may interact with each other and share interactive videos and other content as a function of touch and gestures, as well as graphical screen controls that, when selected, cause a computing device to execute one or more instructions and effect various functionality.

Distributable coordinated presentations in accordance with the present application can include interactive video having customizable and interactive functionality for and between devices of a plurality of end-users who receive the video. Further, the one or more software applications configure a user computing device with a viewing/interactive tool, referred to herein, generally, as a “consuming” interface for end-users who receive interactive videos that are authored in accordance with the present application and usable for end-users to communicate (e.g., via interactive video conferencing functionality). Users may interact with each other and share interactive videos and other content as a function of touch and gestures, as well as graphical screen controls that, when selected, cause a computing device to execute one or more instructions and effect various functionality. For example, a smartphone or other mobile computing device can be configured via one or more applications to simulate a laser pointer, drawing tool, mouse, trackball, keyboard or other input device.

In accordance with the teachings herein, implementations of the present application provide a simple to use, informing and entertaining communications experience that incorporates content from a plurality of computing devices, e.g., smartphones, tablets, laptops and desktops, and enable live sharing and real-time conferencing capability therewith. In one or more implementations, various kinds of display devices, such as televisions, can be used for respective audio/visual display. Such devices can provide feeds from cameras and/or microphones configured with various local and/or remotely located computing devices that are communicating over data communication networks such as the Internet. Display using a television can be implemented in the present application in various ways, such as via an Internet media extender provided by APPLE TV, ROKU, AMAZON FIRE TV or GOOGLE CHROMECAST. As used herein, an Internet media extender refers, generally, to a category of devices that provide for content to be streamed to a monitor and/or television, surround sound devices, and the like. Unlike functionality provided by known Internet media extenders, however, the present application facilitates integrating audio/video input capabilities of computing devices (e.g., microphones, cameras and software that drive and enhance audio/visual captures) into video-conferencing capabilities. The present application facilitates one or more of: one-to-one (1:1) video conferencing; group video conferencing; sharing and/or viewing of content provided on a plurality of computing devices; and interactive computing activities.

In one or more implementations, content, which can be formatted as and/or include images, audio/video content, website content, computer programs and/or other content provided in various formats (collectively referred to herein, generally, as “vApps”), can be implemented vis-a-vis one or more mobile software applications. vApp icons can be provided that represent respective vApps that are included with the conferencing sessions. In accordance with one or more implementations, after a respective icon is selected by a user, the user can interact with the vApp represented by that icon. Functionality, information and/or content can be associated with the vApp and provided in a shared conferencing session, which is made available to user computing devices connected thereto.

Thus, in one or more implementations, the present application provides for interactive video conferencing that integrates audio/video input and output from individual mobile computing devices (e.g., smartphones and tablet computers) with Internet media extender devices (e.g., APPLE TV). By leveraging technology configured with mobile computing devices, e.g., cameras and microphones, the present application provides a new form of live and interactive functionality that can make a person's living room or other residential viewing area into a high-end video conferencing suite. Non-residential implementations are supported, as well, as shown and described in greater detail herein.

Accordingly, the present application provides online collaborative services, for example, including for webinars, webcasts, and meetings. In one or more implementations, Internet technologies such as TCP/IP connectivity support web conferencing services, including sharing of audio, video, textual and various forms of multi-media content.

Referring to FIG. 1, a diagram is provided that shows an example hardware arrangement that operates for providing the systems and methods disclosed herein, and designated generally as system 100. System 100 can include one or more data processing apparatuses 102 that are at least communicatively coupled to one or more user computing devices 104 across communication network 106. Data processing apparatuses 102 and user computing devices 104 can include, for example, mobile computing devices such as tablet computing devices, smartphones, personal digital assistants or the like, as well as laptop computers and/or desktop computers. Further, one computing device may be configured as a data processing apparatus 102 and a user computing device 104, depending upon operations being executed at a particular time. In addition, an audio/visual capture device 105 is depicted in FIG. 1, which can be configured with one or more cameras (e.g., front-facing and rear-facing cameras), a microphone, a microprocessor, and a communications module(s) and that is coupled to data processing apparatus 102. The audio/visual capture device 105 can be configured to interface with one or more data processing apparatuses 102 for producing high quality and interactive multimedia content, and supporting interactive video conferencing.

With continued reference to FIG. 1, data processing apparatus 102 can be configured to access one or more databases for the present application, including image files, video content, documents, audio/video recordings, metadata and other information. However, it is contemplated that data processing apparatus 102 can access any required databases via communication network 106 or any other communication network to which data processing apparatus 102 has access. Data processing apparatus 102 can communicate with devices comprising databases using any known communication method, including a direct serial, parallel, universal serial bus (“USB”) interface, or via a local or wide area network.

User computing devices 104 can communicate with data processing apparatuses 102 using data connections 108, which are respectively coupled to communication network 106. Communication network 106 can be any communication network, but is typically the Internet or some other global computer network. Data connections 108 can be any known arrangement for accessing communication network 106, such as the public Internet, private Internet (e.g., VPN), dedicated Internet connection, or dial-up serial line interface protocol/point-to-point protocol (SLIP/PPP), integrated services digital network (ISDN), dedicated leased-line service, broadband (cable) access, frame relay, digital subscriber line (DSL), asynchronous transfer mode (ATM) or other access techniques.

User computing devices 104 preferably have the ability to send and receive data across communication network 106, and are equipped with web browsers, software applications, or other means, to provide received data on display devices incorporated therewith. By way of example, user computing devices 104 may be personal computers such as Intel Pentium-class and Intel Core-class computers or Apple Macintosh computers, tablets, or smartphones, but are not limited to such computers. Other computing devices which can communicate over a global computer network, such as palmtop computers, personal digital assistants (PDAs) and mass-marketed Internet access devices such as WebTV, can be used. In addition, the hardware arrangement of the present invention is not limited to devices that are physically wired to communication network 106, and wireless communication can be provided between wireless devices and data processing apparatuses 102. In addition, system 100 can include Internet media extender 110 that is communicatively coupled to television 112, such as via a high-definition multimedia interface (“HDMI”) or other connection.

System 100 preferably includes software that provides functionality described in greater detail herein, and preferably resides on one or more data processing apparatuses 102 and/or user computing devices 104. One of the functions performed by data processing apparatus 102 is that of operating as a web server and/or a web site host. Data processing apparatuses 102 typically communicate with communication network 106 across a permanent, i.e., un-switched, data connection 108. Permanent connectivity ensures that access to data processing apparatuses 102 is always available.

FIG. 2 illustrates, in block diagram form, an exemplary data processing apparatus 102 and/or user computing device 104 that can provide various functionality, as shown and described herein. Although not expressly indicated, one or more features shown and described with reference to FIG. 2 can be included with or in the audio/visual capture device 105, as well. Data processing apparatus 102 and/or user computing device 104 may include one or more microprocessors 205 and connected system components (e.g., multiple connected chips), or the data processing apparatus 102 and/or user computing device 104 may be a system on a chip.

The data processing apparatus 102 and/or user computing device 104 includes memory 210 which is coupled to the microprocessor(s) 205. The memory 210 may be used for storing data, metadata, and programs for execution by the microprocessor(s) 205. The memory 210 may include one or more of volatile and non-volatile memories, such as Random Access Memory (“RAM”), Read Only Memory (“ROM”), Flash, Phase Change Memory (“PCM”), or other types of memory. The data processing apparatus 102 and/or user computing device 104 also includes an audio input/output subsystem 215 which may include one or more microphones and/or speakers.

A display controller and display device 220 provides a visual user interface for the user; this user interface may include a graphical user interface which, for example, is similar to that shown on a Macintosh computer when running Mac OS operating system software, or an iPad, iPhone, or similar device when running iOS operating system software.

The data processing apparatus 102 and/or user computing device 104 also includes one or more wireless transceivers 230, such as an IEEE 802.11 transceiver, an infrared transceiver, a Bluetooth transceiver, a wireless cellular telephony transceiver (e.g., 1G, 2G, 3G, 4G), or another wireless protocol to connect the data processing system 100 with another device, external component, or a network.

It will be appreciated that one or more buses may be used to interconnect the various modules in the block diagram shown in FIG. 2.

The data processing apparatus 102 and/or user computing device 104 also includes one or more input or output (“I/O”) devices and interfaces 225 which are provided to allow a user to provide input to, receive output from, and otherwise transfer data to and from the system. These I/O devices may include a mouse, keypad or a keyboard, a touch panel or a multi-touch input panel, camera, network interface, modem, other known I/O devices or a combination of such I/O devices. The touch input panel may be a single touch input panel which is activated with a stylus or a finger, or a multi-touch input panel which is activated by one finger or a stylus or multiple fingers, and the panel is capable of distinguishing between one or two or three or more touches and is capable of providing inputs derived from those touches to the data processing apparatus 102 and/or user computing device 104. The I/O devices and interfaces 225 may include a connector for a dock or a connector for a USB interface, FireWire, etc. to connect the system 100 with another device, external component, or a network. Moreover, the I/O devices and interfaces can include gyroscope and/or accelerometer 227, which can be configured to detect 3-axis angular acceleration around the X, Y and Z axes, enabling precise calculation, for example, of yaw, pitch, and roll. The gyroscope and/or accelerometer 227 can be configured as a sensor that detects acceleration, shake, vibration, shock, or fall of a device 102/104, for example, by detecting linear acceleration along one of three axes (X, Y and Z). The gyroscope can work in conjunction with the accelerometer to provide detailed and precise information about the device's axial movement in space. More particularly, the 3 axes of the gyroscope combined with the 3 axes of the accelerometer enable the device to recognize approximately how far, fast, and in which direction it has moved, to generate telemetry information associated therewith that is processed to generate coordinated presentations, such as shown and described herein.

Additional components, not shown, can also be part of the data processing apparatus 102 and/or user computing device 104, and, in certain embodiments, fewer components than those shown in FIG. 2 may also be used in data processing apparatus 102 and/or user computing device 104. It will be apparent from this description that aspects of the inventions may be embodied, at least in part, in software. That is, the computer-implemented methods may be carried out in a computer system or other data processing system in response to its processor or processing system executing sequences of instructions contained in a memory, such as memory 210 or other machine-readable storage medium. The software may further be transmitted or received over a network (not shown) via a network interface device 225. In various embodiments, hardwired circuitry may be used in combination with the software instructions to implement the present embodiments. Thus, the techniques are not limited to any specific combination of hardware circuitry and software, or to any particular source for the instructions executed by the data processing apparatus 102 and/or user computing device 104.

In one or more implementations, the present application provides improved processing techniques to prevent packet loss, to improve handling of interruptions in communications, and to reduce or eliminate latency and other issues associated with wireless technology. For example, in one or more implementations Real Time Streaming Protocol (RTSP) can be implemented, for example, for sharing output associated with a camera, microphone and/or other output devices configured with a computing device. RTSP is an effective (though not necessary in all implementations) network control protocol for entertainment and communications systems, including in connection with streaming output. RTSP is used in the present application, at least in part, for establishing and controlling media sessions between various end points, including user computing devices 104, Internet media extender 110 and data processing apparatus 102.

In addition to RTSP, one or more implementations of the present application can be configured to use Web Real-Time Communication (“WebRTC”) to support browser-to-browser applications, including in connection with voice, video chat, and peer-to-peer (“P2P”) file sharing. Thus, the present application avoids a need for either internal or external plugins to connect endpoints, including for voice/video or other communication sharing. In one or more implementations, the present application implements WebRTC for applications and/or Internet web sites to capture and/or stream audio and/or video media, as well as to exchange data between browsers without requiring an intermediary. The set of standards that comprises WebRTC makes it possible to share data and perform teleconferencing peer-to-peer, without requiring that the user install plug-ins or any other third-party software. WebRTC includes several interrelated APIs and protocols which work together.
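
As a concrete illustration of the WebRTC flow referenced above, the following browser-side TypeScript sketch captures local audio/video with the standard getUserMedia and RTCPeerConnection APIs and creates an offer. The signaling channel (sendToSignalingServer) is a placeholder, since signaling is application-specific and is not defined by WebRTC itself; it is not part of the disclosed system.

    // Minimal browser-side WebRTC sketch: capture local audio/video and offer
    // a peer connection. Signaling is stubbed out as sendToSignalingServer().
    async function startPeer(sendToSignalingServer: (msg: object) => void) {
      const localStream = await navigator.mediaDevices.getUserMedia({
        audio: true,
        video: true,
      });

      const pc = new RTCPeerConnection();
      localStream.getTracks().forEach((track) => pc.addTrack(track, localStream));

      // Forward ICE candidates to the remote peer via the signaling channel.
      pc.onicecandidate = (event) => {
        if (event.candidate) {
          sendToSignalingServer({ type: "candidate", candidate: event.candidate });
        }
      };

      const offer = await pc.createOffer();
      await pc.setLocalDescription(offer);
      sendToSignalingServer({ type: "offer", sdp: offer.sdp });

      return pc;
    }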

In one or more implementations, at least one of the Internet media extender components 110 includes APPLE TV. After an Internet media extender 110 is installed (e.g., connected to a television set and connected to a Wi-Fi, Ethernet or other local area network), a software application is installed on the Internet media extender 110, as well as on at least one mobile computing device 104. For example, a user downloads and installs an app to an Internet media extender 110 (“TV APP”) and also installs an app to a user computing device 104 (“MOBILE APP”). Once installed, and the first time the TV APP is executed, the user is prompted to launch the MOBILE APP. Thereafter, the mobile computing device 104 (e.g., an iPhone) is automatically detected by the TV APP. During subsequent uses, video content that is provided as a function of audio/video output from the computing device (e.g., iPhone) is provided instantly on the television that is connected to the Internet media extender 110. In operation, the audio/video feed from the iPhone is provided on the big screen. The TV APP and the MOBILE APP may be configured as a single application (e.g., distributed as a single application), or may be provided as separate applications.

In one or more implementations, each of a plurality of participants operating, for example, user computing devices 104 participate in an interactive video conference at least in part by establishing a data/communication session with the data processing apparatus 102. A form of a star topology is established, in which data processing apparatus 102 is communicatively connected to each of a plurality of respective user computing devices 104 and respectively receives an audio/video feed from each device, such as provided as a function of input from a respective camera 225A and/or microphone 225B (FIG. 2A).

In one or more implementations of the present application, wireless connectivity, such as BLUETOOTH low energy (“BLE”), is configured vis-a-vis the user computing devices 104, such as to implement a mesh network within a location, such as a conference room. In one or more implementations, an identifier (“ID”) representing a respective session, computing device and/or other suitable information is transmitted to and from devices within the location. Other information that is determined and can be transmitted includes a respective level of volume that is being received by a respective microphone 225A. For example, a microphone 225A of a computing device 104 operated by one person who is not speaking receives low volume, while a microphone 225A of a computing device 104 operated by a person who is speaking receives higher volume. The information representing the respective input levels is processed, for example, by a single computing device 102 or 104, or by a plurality of devices, e.g., 102 and/or 104, to cause a microphone 225A and/or speaker 225B to be adjusted (e.g., muted, turned up or down) substantially in real time. Using data representing input levels, the software operating on one or more user computing devices 104 can cause the device to mute the microphone when a person is not speaking, and while audio output is coming from the speaker 225B. This would occur, for example, when a person operating a respective user computing device 104 is simply listening to someone speaking. The respective computing device 104 of the person speaking, for example, referred to herein generally as the “active” device, can be configured to operate in the opposite way, such that the speaker 225B is adjusted, such as to be turned down or off, while the active device's microphone 225A is not muted.
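
A minimal sketch of the level-based selection described above follows, with hypothetical names; only the decision logic (the device reporting the highest microphone level above a threshold is treated as the active device) reflects the text, while the BLE transport and the volume-control calls themselves are omitted.

    // Illustrative selection of the "active" device from reported input levels.
    interface LevelReport {
      deviceId: string;
      micLevel: number; // e.g., a normalized level shared over the BLE mesh
    }

    // Returns the id of the device to treat as active, or null if no device
    // is receiving speech above the (assumed) threshold.
    function selectActiveDevice(reports: LevelReport[], threshold = 0.2): string | null {
      if (reports.length === 0) return null;
      const loudest = reports.reduce((a, b) => (b.micLevel > a.micLevel ? b : a));
      return loudest.micLevel >= threshold ? loudest.deviceId : null;
    }

The selected id could then be passed to logic such as the earlier SessionController sketch, which mutes the other microphones and lowers the active device's speaker.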

FIG. 2B illustrates an example active user computing device 104A, in which the microphone 225A is enabled and the speaker 225B is turned down (illustrated with an X through the speaker 225B). User computing device 104B illustrated in FIG. 2B is a user computing device that is not active, in which the microphone 225A is disabled (illustrated with an X through the microphone 225A), and the speaker 225B is operating (e.g., turned up).

FIG. 3 is an example high-level diagram that illustrates interactivity between various ones of the devices illustrated in FIG. 1, and identifies example communication protocols in one or more implementations of the present application. The implementation illustrated in FIG. 3 is usable as a consumer (e.g., a residential) implementation, as well as an enterprise implementation. As illustrated in FIG. 3, WebRTC is shown with regard to communications between user computing devices 104 (shown as a CHROME BOOK and a mobile computing device, e.g., a smart phone) and supporting browser-to-browser applications and P2P functionality. In addition, RTSP is utilized in connection with user computing devices 104 and Internet media extender 110, thereby enabling presentation of audio/video content from devices 104 on television 112.

FIGS. 4A and 4B illustrate operational aspects of computing devices configured in accordance with an example implementation of the present application. As shown in FIG. 4A, a plurality of computing devices 104 (104A and 104B) are positioned around a conference room during an interactive video-conferencing session. A single user computing device 104A is the active device, and is shown circled. As the user of the respective active device speaks, the speaker 225B is adjusted to be low or off, while the microphone 225A remains unmuted. The remaining devices 104B operate in the opposite way, with the respective microphones 225A muted and the respective speakers 225B operating (e.g., adjusted suitably for listening). In FIG. 4B, the same interactive videoconferencing session is shown with a different user computing device 104A as the active device (again, shown circled). The ability for devices to switch between being active or not can be implemented in various ways. For example, a rule set can be enforced in which a minimum amount of time (e.g., 1 second) of silence must exist for the active device 104A to operate as a non-active device 104B. Alternatively, when a second user begins to speak during the session, that user's computing device may become the active device 104A, thereby causing the previously active device 104A to become non-active 104B. By controlling the microphones 225A and speakers 225B of the respective devices 104, the present application eliminates echo, feedback, or other audio disturbances and provides an improved audio experience. Various timing requirements, conditions or other rule sets can be defined and implemented in accordance with one or more implementations of the present application.
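
One possible encoding of such a rule set is sketched below, assuming a hypothetical controller with an onSpeechDetected() method (for example, the sketch given earlier). The 1-second silence threshold is the example value from the text, and handing off only after that period of silence is one of the alternatives described; names are illustrative.

    // Illustrative hand-off rule: an active device keeps its status until it
    // has been silent for at least minSilenceMs, after which another speaking
    // device may take over as the active device.
    class ActiveSpeakerRules {
      private lastSpeechAt = 0;

      constructor(
        private controller: { onSpeechDetected(id: string): void },
        private minSilenceMs = 1000,
      ) {}

      handleSpeech(deviceId: string, currentActiveId: string | null): void {
        const now = Date.now();
        if (deviceId === currentActiveId) {
          this.lastSpeechAt = now; // the active device is still talking
          return;
        }
        // Hand off only if there is no active device yet, or the active device
        // has been silent for the required minimum period.
        if (currentActiveId === null || now - this.lastSpeechAt >= this.minSilenceMs) {
          this.controller.onSpeechDetected(deviceId);
          this.lastSpeechAt = now;
        }
      }
    }

The alternative rule described in the text, in which a second speaker takes over immediately, would simply drop the silence check.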

In addition to adjusting the input levels of microphones 225A and output levels of speakers 225B, the present application includes technology for synching audio content, including as a function of one or more BLE signals. For example, audio content can be delayed and synched for input/output on the user computing devices 104, which alleviates audio disturbances, such as echo and/or feedback. In one or more implementations, an additional data layer can be generated and provided with (e.g., “on top of”) an audio/video signal. For example, a signal can be embedded into an audio signal such that the user computing device 104 can interpret and use the signal to control operations of audio input/output. For example, a virtual form of watermarking audio content can be provided for every 10 frames of audio/video content, which can be read digitally to facilitate synchronizing output. For example, a watermark can be provided every second that escalates by a factor of 30 or so for every second, such that the respective frames of audio/video content being interpreted (e.g., read) at the respective devices 104 can be determined.
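
The watermarking approach can be illustrated, in simplified form, as attaching a readable counter to every Nth portion of the content so that a device can later report which frame it is playing. The structure below is a hypothetical sketch and does not represent the actual embedding technique (for example, how the data would be mixed into the audio itself).

    // Illustrative frame tagging: attach a counter ("watermark") to every Nth
    // frame so playback position can be recovered later.
    interface TaggedFrame {
      index: number;          // frame number within the session
      watermark?: number;     // present on every Nth frame only
      samples: Float32Array;  // audio payload for this frame
    }

    function tagFrames(frames: Float32Array[], everyN = 10): TaggedFrame[] {
      return frames.map((samples, index) => ({
        index,
        samples,
        ...(index % everyN === 0 ? { watermark: index } : {}),
      }));
    }

    // A receiver can recover its playback position from the most recent watermark.
    function latestWatermark(received: TaggedFrame[]): number | undefined {
      for (let i = received.length - 1; i >= 0; i--) {
        const wm = received[i].watermark;
        if (wm !== undefined) return wm;
      }
      return undefined;
    }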

FIGS. 5A-5D illustrate example generation, configuration and utilization of audio (e.g., portions with frames) in connection with an interactive video conferencing session, in accordance with one or more implementations of the present application. FIG. 5A, for example, illustrates example audio content 502 that is configured with data (e.g., a form of watermarking). FIG. 5B illustrates portions of audio content 502A, 502B, 502C, 502D . . . 502′ that can be provided with, for example, video frames, and that collectively can represent audio content 502. Each portion can be provided with respective data (e.g., respective watermarks), which is usable by one or more computing devices 104 to identify the respective frames. During playback, for example, a time delay can be implemented to prevent audio content 502 from being output on a respective speaker 225B substantially at the same time as the audio content is captured via a respective microphone 225A.

FIG. 5C illustrates an example time sequence in which no audio content is output during a first portion of time and, thereafter, audio content is output to a respective speaker 225B. For example, a user of a respective device 104 during an interactive video conferencing session begins to speak and the respective microphone 225A captures the content. By delaying the output of the respective speaker 225B by a predefined amount, such as 1 second or more, the user's microphone will not receive the output of the user's respective speaker 225B while the user is speaking. Echo or feedback can, thereby, be reduced or eliminated. Moreover, and as described above, the output level of the speaker 225B of the respective device of the user who is speaking can be increased or decreased to further reduce or eliminate audio disturbance.
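
The playback delay can be sketched as a simple queue that releases captured audio to the speaker only after a fixed interval; the class and method names below are illustrative, and the 1-second default mirrors the example delay in the text.

    // Illustrative delayed-playback queue: captured audio is held back for
    // delayMs before it is released to the audio output subsystem.
    class DelayedPlayback {
      private queue: { releaseAt: number; samples: Float32Array }[] = [];

      constructor(private delayMs = 1000) {}

      // Called as the microphone captures each block of samples.
      enqueue(samples: Float32Array): void {
        this.queue.push({ releaseAt: Date.now() + this.delayMs, samples });
      }

      // Called periodically by the output loop; returns the blocks now due.
      dueFrames(now = Date.now()): Float32Array[] {
        const due: Float32Array[] = [];
        while (this.queue.length > 0 && this.queue[0].releaseAt <= now) {
          due.push(this.queue.shift()!.samples);
        }
        return due;
      }
    }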

FIG. 5D illustrates a plurality of computing devices 104 that are positioned around a conference room during an interactive video-conferencing session. A single user computing device 104 is the active device, and is shown circled. As the user of the respective active device speaks, the audio content captured by the microphone 225A is output to the respective speaker 225B in a delayed fashion. In the example illustrated in FIG. 5D, the audio input content portion (e.g., what the user is saying at that time) may be 502′, while the audio output of the speaker 225B is only at 502A. Moreover, as a function of the BLE signals and/or other data captured by the respective devices 104 during the session, the particular frames being played by each respective device 104 are known. For example, one device 104 is at 502D, another at 502B and another at 502C. This can occur, for example, as a function of differences in bandwidth, processor performance and/or implemented time delays as a function of operational configurations.
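
Because each device can report the watermark of the frame it is currently playing, a session can estimate how far each device lags the capture position; the helper below is a hypothetical sketch of that comparison, and any resulting per-device delay adjustment is left open.

    // Illustrative comparison of reported playback positions against the
    // current capture position (all positions expressed as watermark indexes).
    function computeLags(
      capturePosition: number,
      playbackPositions: Map<string, number>, // deviceId -> last watermark played
    ): Map<string, number> {
      const lags = new Map<string, number>();
      for (const [deviceId, position] of playbackPositions) {
        lags.set(deviceId, capturePosition - position);
      }
      return lags;
    }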

Thus, as shown and described herein, audio disturbances, such as feedback, echo, cross-talk or other disturbances, are significantly reduced or eliminated as a function of controlling audio input/output levels substantially in real-time. In addition or in the alternative, such audio disturbances can be controlled as a function of embedding or including data with audio content, such as via a form of audio watermarks.

Although many of the examples shown and described herein regard distribution of coordinated presentations to a plurality of users, the invention is not so limited. Although illustrated embodiments of the present invention have been shown and described, it should be understood that various changes, substitutions, and alterations can be made by one of ordinary skill in the art without departing from the scope of the present application.

What is claimed is:
 1. A system for providing respective interactive audio/video content of each of a plurality of computing devices during a networked communication session, the system comprising: non-transitory processor readable media; at least one processor operatively coupled to the non-transitory processor readable media, wherein the non-transitory processor readable media have instructions that, when executed by the at least one processor, causes the at least one processor to perform the following steps: define the networked communication session; provide respective access to the networked communication session to each of a plurality of user computing devices that are each configured with at least a camera, an audio input subsystem, and an audio output subsystem; generate a composited interactive audio/video feed comprised of video and audio input received during the networked communication session from at least two of the respective user computing devices; transmit to each of the plurality of user computing devices the composited audio/video feed, a) wherein, during the networked communication session and after an audio input subsystem of a first user computing device of the plurality of user computing devices detects a user of the first user computing device speaking, the at least one processor executes instructions to: i) adjust volume of the audio output subsystem of the first user computing device; and ii) adjust volume of the audio input subsystem of each of the other respective ones of the plurality of user computing devices; and b) wherein, during the networked communication session and after an audio input subsystem of a second user computing device of the plurality of user computing devices detects a user of the second user computing device speaking, the at least one processor executes instructions to: i) adjust volume of the audio output subsystem of the second user computing device; ii) adjust volume of the audio input subsystem of each of the other respective ones of the plurality of user computing devices; and iii) adjust volume of the audio output subsystem of the first user computing device.
 2. The system of claim 1, wherein the non-transitory processor readable media have instructions that, when executed by the at least one processor, causes the at least one processor to delay steps b) i), b) ii), and b) iii) until after a predetermined amount of time during which none of the plurality of user computing devices detects any user speaking.
 3. The system of claim 1, wherein, during the networked communication session and after the audio input subsystem of the first user computing device of the plurality of user computing devices detects the user of the first user computing device speaking, an instruction executes to: designate the first user computing device an active device; and designate each of the other user computing devices of the plurality of computing devices non-active devices; and wherein, during the networked communication session and after the audio input subsystem of the second user computing device of the plurality of user computing devices detects the user of the second user computing device speaking, an instruction executes to: designate the second user computing device the active device; and designate the first user computing device a non-active device among the plurality of other non-active devices.
 4. The system of claim 1, wherein the adjustment in steps a) i), a) ii), b) i), and b) ii) is to decrease volume or to mute volume.
 5. The system of claim 1, wherein the adjustment in step b) iii) is to increase volume.
 6. The system of claim 1, wherein the non-transitory processor readable media have instructions that, when executed by the at least one processor, causes the at least one processor to: synchronize audio content and video content, including by delaying providing at least one of the audio content and video content to at least one of the user computing devices.
 7. The system of claim 6, wherein the delaying occurs as a function of at least one BLUETOOTH signal.
 8. The system of claim 6, wherein the synchronizing and/or delaying occurs as a function of at least data received in audio.
 9. The system of claim 8, wherein the data is usable by the at least one processor and/or at least one of the plurality of user computing devices for a time delay that prevents output of audio and input of audio occurring simultaneously on one of the plurality of user computing devices.
 10. The system of claim 6, wherein the synchronizing occurs as a function of differences in available bandwidth between the respective user computing devices.
 11. A method for providing respective interactive audio/video content of each of a plurality of computing devices during a networked communication session, the method comprising: defining, by at least one processor configured by executing instructions on non-transitory processor readable media, the networked communication session; providing, by the at least one processor, respective access to the networked communication session to each of a plurality of user computing devices that are each configured with at least a camera, audio input subsystem, and audio output subsystem; generating, by the at least one processor, a composited interactive audio/video feed comprised of video and audio input received during the networked communication session from at least two of the respective user computing devices; transmitting, by the at least one processor to each of the plurality of user computing devices the composited audio/video feed, a) wherein, during the networked communication session and after an audio input subsystem of a first user computing device of the plurality of user computing devices detects a user of the first user computing device speaking: i) adjusting, by the at least one processor, volume of the audio output subsystem of the first user computing device; and ii) adjusting, by the at least one processor, volume of the audio input subsystem of each of the other respective ones of the plurality of user computing devices; and b) wherein, during the networked communication session and after an audio input subsystem of a second user computing device of the plurality of user computing devices detects a user of the second user computing device speaking: i) adjusting, by the at least one processor, volume of the audio output subsystem of the second user computing device; ii) adjusting, by the at least one processor, volume of the audio input subsystem of each of the other respective ones of the plurality of user computing devices; and iii) adjusting, by the at least one processor, volume of the audio output subsystem of the first user computing device.
 12. The method of claim 11, further comprising: delaying, by the at least one processor, steps b) i), b) ii), and b) iii) until after a predetermined amount of time during which none of the plurality of user computing devices detects any user speaking.
 13. The method of claim 11, further comprising: during the networked communication session and after the audio input subsystem of the first user computing device of the plurality of user computing devices detects the user of the first user computing device speaking: designating, by the at least one processor, the first user computing device an active device; and designating, by the at least one processor, each of the other user computing devices of the plurality of computing devices non-active devices; and during the networked communication session and after the audio input subsystem of the second user computing device of the plurality of user computing devices detects the user of the second user computing device speaking: designating, by the at least one processor, the second user computing device the active device; and designating, by the at least one processor, the first user computing device a non-active device among the plurality of other non-active devices.
 14. The method of claim 11, wherein the adjustment in steps a) i), a) ii), b) i), and b) ii) is to decrease volume or to mute volume.
 15. The method of claim 11, wherein the adjustment in step b) iii) is to increase volume.
 16. The method of claim 11, further comprising: synchronizing, by the at least one processor, audio content and video content, including by delaying providing at least one of the audio content and video content to at least one of the user computing devices.
 17. The method of claim 16, wherein the delaying occurs as a function of at least one BLUETOOTH signal.
 18. The method of claim 16, wherein the synchronizing and/or delaying occurs as a function of at least data received in audio.
 19. The method of claim 18, wherein the data is usable by the at least one processor and/or at least one of the plurality of user computing devices for a time delay that prevents output of audio and input of audio occurring simultaneously on one of the plurality of user computing devices.
 20. The method of claim 16, wherein the synchronizing occurs as a function of differences in available bandwidth between the respective user computing devices.